The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
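The patch-based training the respondents describe can be sketched minimally as sliding-window patch extraction from a volume too large to process at once; the sizes and stride below are illustrative, not taken from any surveyed pipeline.

```python
import numpy as np

def extract_patches(volume, patch_size, stride):
    """Slide a fixed-size window over a 3D volume and collect patches,
    a common workaround when whole samples do not fit in memory."""
    patches = []
    d, h, w = volume.shape
    pd, ph, pw = patch_size
    for z in range(0, d - pd + 1, stride):
        for y in range(0, h - ph + 1, stride):
            for x in range(0, w - pw + 1, stride):
                patches.append(volume[z:z + pd, y:y + ph, x:x + pw])
    return np.stack(patches)

volume = np.random.rand(64, 64, 64)  # stand-in for a large 3D scan
patches = extract_patches(volume, (32, 32, 32), stride=32)
print(patches.shape)  # (8, 32, 32, 32)
```

During training, such patches would be fed to the network in batches; at inference the per-patch predictions are stitched back into the full volume.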
Deep neural operators can learn nonlinear mappings between infinite-dimensional function spaces via deep neural networks. As promising surrogate solvers of partial differential equations (PDEs) for real-time prediction, deep neural operators such as deep operator networks (DeepONets) provide a new simulation paradigm in science and engineering. Pure data-driven neural operators, and deep learning models in general, are usually limited to interpolation scenarios, where new predictions utilize inputs within the support of the training set. However, in the inference stage of real-world applications, the input may lie outside the support, i.e., extrapolation is required, which may result in large errors and unavoidable failure of deep learning models. Here, we address this challenge of extrapolation for deep neural operators. First, we systematically investigate the extrapolation behavior of DeepONets by quantifying the extrapolation complexity via the 2-Wasserstein distance between two function spaces, and we identify a new bias-variance trade-off behavior for extrapolation with respect to model capacity. Subsequently, we develop a complete workflow, including extrapolation determination, and propose five reliable learning methods that guarantee a safe prediction under extrapolation by requiring additional information -- the governing PDEs of the system or sparse new observations. The proposed methods are based on either fine-tuning a pre-trained DeepONet or multifidelity learning. We demonstrate the effectiveness of the proposed framework for various types of parametric PDEs. Our systematic comparisons provide practical guidelines for selecting a proper extrapolation method depending on the available information, desired accuracy, and required inference speed.
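The paper quantifies extrapolation complexity via a 2-Wasserstein distance between function spaces; the scalar analogue for equal-size 1D samples is quantile (sorted-sample) matching, sketched below with illustrative Gaussian samples rather than function-space data.

```python
import numpy as np

def w2_1d(a, b):
    """2-Wasserstein distance between two equal-size 1D samples via
    sorted (quantile) matching -- a scalar analogue of the distance
    the paper computes between function spaces."""
    a, b = np.sort(a), np.sort(b)
    return np.sqrt(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 10000)
interp = rng.normal(0.0, 1.0, 10000)  # same distribution: small W2
extrap = rng.normal(2.0, 1.0, 10000)  # shifted distribution: W2 near 2
print(w2_1d(train, interp) < w2_1d(train, extrap))  # True
```

Larger distance between the training distribution and the test inputs signals harder extrapolation, motivating the determination step in the proposed workflow.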
Mixture-of-Experts (MoE) parallelism is a recent advancement that scales up the model size with constant computational cost. MoE selects different sets of parameters (i.e., experts) for each incoming token, resulting in a sparsely activated model. Despite several successful applications of MoE, its training efficiency degrades significantly as the number of experts increases. The routing stage in MoE relies on the efficiency of the All2All communication collective, which suffers from network congestion and has poor scalability. To mitigate these issues, we introduce SMILE, which exploits heterogeneous network bandwidth and splits single-step routing into bi-level routing. Our experimental results show that the proposed method obtains a 2.5x speedup over Switch Transformer in terms of pretraining throughput on the Colossal Clean Crawled Corpus without losing any convergence speed.
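The sparse activation underlying MoE can be illustrated with Switch-style top-1 routing: each token is dispatched to the single expert with the highest gate score. This is a single-device sketch with made-up dimensions; it shows the routing decision itself, not SMILE's bi-level communication scheme across heterogeneous links.

```python
import numpy as np

def top1_route(tokens, gate_w, experts):
    """Switch-style top-1 routing: score every expert per token, then
    apply only the highest-scoring expert to each token."""
    logits = tokens @ gate_w              # (n_tokens, n_experts)
    choice = logits.argmax(axis=1)        # chosen expert per token
    out = np.empty_like(tokens)
    for e, expert_w in enumerate(experts):
        mask = choice == e
        out[mask] = tokens[mask] @ expert_w  # sparse: one expert per token
    return out, choice

rng = np.random.default_rng(0)
d, n_experts, n_tokens = 8, 4, 16
tokens = rng.standard_normal((n_tokens, d))
gate_w = rng.standard_normal((d, n_experts))
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
out, choice = top1_route(tokens, gate_w, experts)
print(out.shape, choice.shape)  # (16, 8) (16,)
```

In a distributed setting the `choice` vector determines which device each token is shipped to, which is exactly the All2All exchange whose congestion SMILE targets.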
Recently, neural networks have proven their impressive ability to solve partial differential equations (PDEs). Among them, the Fourier neural operator (FNO) has shown success in learning solution operators for highly non-linear problems such as turbulent flow. FNO is discretization-invariant, meaning it can be trained on low-resolution data and generalize to high-resolution problems. This property is related to the low-pass filters in FNO, where only a limited number of frequency modes are selected to propagate information. However, it is still a challenge to select an appropriate number of frequency modes and training resolution for different PDEs. Too few frequency modes and low-resolution data hurt generalization, while too many frequency modes and high-resolution data are computationally expensive and lead to over-fitting. To address this, we propose the Incremental Fourier Neural Operator (IFNO), which augments both the frequency modes and the data resolution incrementally during training. We show that IFNO achieves better generalization (around a 15% reduction in testing L2 loss) while reducing the computational cost by 35%, compared to the standard FNO. In addition, we observe that IFNO follows the behavior of implicit regularization in FNO, which explains its excellent generalization ability.
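The low-pass filtering at the heart of FNO, and the knob IFNO grows during training, can be sketched as truncating a signal's Fourier spectrum to its lowest modes; the signal and mode counts below are illustrative only.

```python
import numpy as np

def truncate_modes(signal, n_modes):
    """Keep only the lowest n_modes Fourier frequencies -- the same kind
    of spectral truncation FNO layers apply; IFNO grows n_modes as
    training progresses."""
    coeffs = np.fft.rfft(signal)
    coeffs[n_modes:] = 0.0                    # zero out high frequencies
    return np.fft.irfft(coeffs, n=len(signal))

x = np.linspace(0, 2 * np.pi, 128, endpoint=False)
signal = np.sin(x) + 0.3 * np.sin(20 * x)     # low + high frequency content
coarse = truncate_modes(signal, n_modes=4)    # early training: few modes
fine = truncate_modes(signal, n_modes=32)     # later: enough modes kept
print(np.allclose(fine, signal, atol=1e-8))   # True
```

With 4 modes the high-frequency component is discarded; once the budget exceeds mode 20, the signal is reproduced exactly, mirroring the generalization-versus-cost trade-off the abstract describes.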
Common data models solve many of the challenges of standardizing electronic health record (EHR) data, but they cannot integrate the resources required for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide semantically computable representations of biological knowledge and enable the integration of a variety of biomedical data. However, mapping EHR data to OBO Foundry ontologies requires significant manual curation and domain expertise. We introduce a framework for mapping Observational Medical Outcomes Partnership (OMOP) standard vocabulary concepts to OBO Foundry ontologies. Using this framework, we produced mappings for 92,367 conditions, 8,615 drug ingredients, and 10,673 measurement results. Domain experts validated the mapping accuracy, and when examined across 24 hospitals, the mappings covered 99% of conditions and drug ingredients and 68% of measurements. Finally, we demonstrate that the OMOP2OBO mappings can help systematically identify undiagnosed rare disease patients who might benefit from genetic testing.
Iterative solvers of linear systems are a key component of the numerical solution of partial differential equations (PDEs). Methods such as Jacobi, Gauss-Seidel, conjugate gradient, multigrid, and their more advanced variants have been studied intensively over the past decades, yet there remains a pressing need to develop faster, more robust, and more reliable solvers. Building on recent advances in scientific deep learning for operator regression, we propose HINTS, a hybrid, iterative, numerical, and transferable solver for differential equations. HINTS combines standard relaxation methods with the deep operator network (DeepONet). Compared to standard numerical solvers, HINTS is capable of providing faster solutions for a wide class of differential equations while preserving accuracy close to machine zero. Through an eigenmode analysis, we find that the individual solvers within HINTS target distinct regions of the eigenspectrum, which leads to a uniform convergence rate and hence the excellent overall performance of the hybrid solver. Moreover, HINTS applies to multidimensional equations, is flexible with respect to the computational domain, and is transferable to different discretizations.
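The classical half of the hybrid scheme, a standard relaxation method, can be sketched with plain Jacobi iteration on a small diagonally dominant system; HINTS alternates such relaxation sweeps with DeepONet corrections, which this sketch deliberately omits.

```python
import numpy as np

def jacobi(A, b, x0, iters):
    """Plain Jacobi relaxation: x_{k+1} = D^{-1} (b - (A - D) x_k),
    where D is the diagonal of A. This is the kind of standard
    relaxation method HINTS combines with a DeepONet."""
    d = np.diag(A)            # diagonal entries as a vector
    R = A - np.diag(d)        # off-diagonal remainder
    x = x0.copy()
    for _ in range(iters):
        x = (b - R @ x) / d
    return x

# Diagonally dominant system, which guarantees Jacobi converges.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x = jacobi(A, b, np.zeros(3), iters=50)
print(np.allclose(A @ x, b))  # True
```

Relaxation of this kind damps high-frequency error modes quickly but low-frequency modes slowly; the eigenmode analysis in the paper shows the DeepONet component covering the complementary part of the spectrum.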
Many genetic mutations adversely affect the structure and function of load-bearing soft tissues, with clinical sequelae often leading to disability or death. Parallel advances in genetics and in the mechanical characterization of tissues have provided significant insight into these conditions, but there remains a need to integrate such information. We present a novel genotype-to-biomechanical-phenotype neural network (G2Φnet) for characterizing and classifying the biomechanical properties of soft tissues, which serve as important functional readouts of tissue health or disease. We illustrate the utility of our approach by inferring the nonlinear, genotype-dependent constitutive behavior of four mouse models involving defects or deficiencies in extracellular constituents. We show that G2Φnet can characterize the constitutive behavior while correctly attributing the associated genotype, by exploiting limited, noisy, and unstructured experimental data. More broadly, G2Φnet provides a powerful method and a paradigm shift for quantitatively relating genotype and biomechanical phenotype, with promise for a better understanding of their interplay in biological tissues.
Can we combine heterogeneous graph structure with text to learn high-quality semantic and behavioral representations? Graph neural networks (GNNs) encode numerical node attributes and graph structure to achieve impressive performance in a variety of supervised learning tasks. Current GNN approaches are challenged by textual features, which typically need to be encoded into numerical vectors before being provided to the GNN, and this may incur some information loss. In this paper, we propose an efficient and effective framework called language model GNN (LM-GNN) to jointly train large language models and graph neural networks. Effectiveness in our framework is achieved through stage-wise fine-tuning of a BERT model, first with heterogeneous graph information and then together with a GNN model. Several system and design optimizations are proposed to enable scalable and efficient training. LM-GNN accommodates node and edge classification as well as link prediction tasks. We evaluate the LM-GNN framework on different datasets and demonstrate the effectiveness of the proposed approach. LM-GNN delivers competitive results in an Amazon query-purchase application.
Graph outlier detection is an emerging but crucial machine learning task with numerous applications. Despite the proliferation of algorithms in recent years, the lack of a standard and unified setting for performance evaluation limits their progress and use in real-world applications. To fill this gap, we present, to the best of our knowledge, the first comprehensive benchmark for unsupervised node outlier detection, UNOD, with the following highlights: (1) evaluating fourteen methods with backbones ranging from classical matrix factorization to the latest graph neural networks; (2) benchmarking method performance on real-world datasets with different types of injected outliers as well as natural outliers; (3) comparing the efficiency and scalability of the algorithms by runtime and GPU memory usage on synthetic graphs of different scales. Based on the analysis of extensive experimental results, we discuss the pros and cons of current methods and point out multiple crucial and promising directions for future research.
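Benchmarks of this kind typically score detectors by ranking quality over outlier scores, commonly ROC-AUC; a minimal sketch using the rank (Mann-Whitney) formulation on synthetic injected outliers follows, with all data and numbers invented for illustration.

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC-AUC via the Mann-Whitney rank formulation: the probability
    that a random outlier is scored above a random inlier."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
labels = np.zeros(1000, dtype=int)
labels[:50] = 1                               # injected outliers
scores = rng.normal(0.0, 1.0, 1000) + 3.0 * labels  # mock detector output
print(roc_auc(scores, labels) > 0.9)  # True
```

A detector that assigns systematically higher scores to injected outliers gets an AUC near 1, while random scores give about 0.5, which makes the metric comparable across the fourteen benchmarked methods.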
Despite its wide adoption in almost every medical diagnostic and examination application, magnetic resonance imaging (MRI) remains a slow imaging modality, which limits its use for dynamic imaging. In recent years, parallel imaging (PI) and compressed sensing (CS) have been utilized to accelerate MRI acquisition. In clinical settings, subsampling the k-space measurements during scan time using Cartesian trajectories, such as rectilinear sampling, is currently the most common CS approach, which, however, is prone to producing aliased reconstructions. With the advent of deep learning (DL) in accelerated MRI, reconstructing faithful images from subsampled data has become increasingly promising. Retrospectively applying a subsampling mask to the k-space data is a way of simulating the accelerated acquisition of k-space data as it would occur in a real clinical setting. In this paper, we compare and review the effect of applying either rectilinear or radial retrospective subsampling on the quality of the reconstructions produced by trained deep neural networks. With the same choice of hyperparameters, we train and evaluate two distinct recurrent inference machines (RIMs), one for each type of subsampling. The qualitative and quantitative results of our experiments indicate that the model trained on radially subsampled data achieves higher performance and learns to estimate reconstructions with higher fidelity, paving the way for other DL approaches to adopt radial subsampling.
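The retrospective subsampling step can be sketched as masking a fully sampled k-space; the rectilinear mask below (keeping every fourth phase-encode line) is a simplified stand-in and does not reproduce the paper's exact sampling patterns.

```python
import numpy as np

def rectilinear_mask(shape, accel):
    """Retrospective Cartesian undersampling: keep every accel-th
    phase-encode line, discarding the rest of k-space."""
    mask = np.zeros(shape)
    mask[::accel, :] = 1.0
    return mask

rng = np.random.default_rng(0)
image = rng.standard_normal((128, 128))        # stand-in for an MR image
kspace = np.fft.fft2(image)                    # fully sampled k-space
mask = rectilinear_mask(kspace.shape, accel=4)
undersampled = kspace * mask                   # simulate accelerated scan
zero_filled = np.fft.ifft2(undersampled).real  # aliased naive recon
print(mask.mean())  # 0.25 -> 4x acceleration
```

The zero-filled inverse transform of the masked data exhibits the aliasing that the trained networks, such as the RIMs compared here, learn to remove; a radial mask would be built analogously from spokes through the k-space center.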